Extracting word-pronunciation pairs from comparable set of text and speech
Abstract
One of the problems in text-to-speech (TTS) and speech-to-text (STT) systems is estimating the pronunciation of unknown words. In this paper, we propose a method for extracting unknown words and their pronunciations from similar sets of Japanese text data and speech data. Out-of-vocabulary words are extracted from the text with a stochastic model, and pronunciation hypotheses are generated for them. These entries are then verified by running automatic speech recognition on the audio data. In this work, we use news articles and broadcast TV news covering similar topics. Most of the extracted pairs were judged correct by human judges. We also tested a TTS frontend enhanced with these entries on other web news articles and observed a 9.2% relative improvement in pronunciation estimation accuracy. The proposed method can be used to realize a spoken language processing system that acquires and updates its lexicon automatically.
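The verification step described in the abstract can be pictured with a minimal sketch. This is not the authors' implementation: the function name `verify_pairs`, the romanized pronunciation strings, and the toy data are all assumptions made for illustration, and the recognizer output is simply modeled as a set of pronunciations observed in the audio.

```python
from typing import Dict, List, Set, Tuple


def verify_pairs(
    candidates: Dict[str, List[str]],
    recognized: Set[str],
) -> List[Tuple[str, str]]:
    """Keep (word, pronunciation) pairs whose hypothesis was seen in ASR output.

    candidates: hypothetical mapping from an out-of-vocabulary word to its
                generated pronunciation hypotheses.
    recognized: hypothetical set of pronunciations observed by the recognizer.
    """
    verified = []
    for word, hypotheses in candidates.items():
        for pron in hypotheses:
            # A hypothesis counts as verified only if the audio supports it.
            if pron in recognized:
                verified.append((word, pron))
    return verified


# Toy data (made up): one word with two candidate readings.
toy_candidates = {"羽生": ["hanyuu", "habu"]}
toy_recognized = {"hanyuu"}
print(verify_pairs(toy_candidates, toy_recognized))
```

In a real system the membership check would presumably be replaced by a recognition score or a count of occurrences in the audio; the abstract does not specify the criterion.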
Similar resources
Incorporating Pronunciation Variation into Extraction of Transliterated-term Pairs from Web Corpora
A novel approach to automatically extracting transliterated-term pairs from Web corpora is proposed in this paper. One of the most important issues addressed is that of taking pronunciation variation into account. Pronunciation variation is a phenomenon of pronunciation ambiguity that seriously affects the term transliteration and hence affects those results produced by transliteration processe...
Pronunciation dependent language models
Speech recognition systems are conventionally broken up into phonemic acoustic models, pronouncing dictionaries in terms of the phonemic units in the acoustic model and language models in terms of lexical units from the pronouncing dictionary. Here we explore a new method for incorporating pronunciation probabilities into recognition systems by moving them from the pronouncing lexicon into the ...
New word learning for spoken document processing through discovery of comparable texts from external resources
This paper presents a new out-of-vocabulary (OOV) word learning approach that dynamically extends the pronunciation lexicon and the language model for large vocabulary continuous speech recognition (LVCSR) in spoken document retrieval (SDR) systems. Based on the assumption that the graphemes as well as the n-gram statistics of the OOV words can be effectively learned from other contemporary or ...
Predicting Word Pronunciation in Japanese
This paper addresses the problem of predicting the pronunciation of Japanese words, especially those that are newly created and therefore not in the dictionary. This is an important task for many applications including text-to-speech and text input method, and is also challenging, because Japanese kanji (ideographic) characters typically have multiple possible pronunciations. We approach this p...
Korean large vocabulary continuous speech recognition with morpheme-based recognition units
In Korean writing, a space is placed between two adjacent word-phrases, each of which generally corresponds to two or three words in English in a semantic sense. If the word-phrase is used as a recognition unit for Korean large vocabulary continuous speech recognition (LVCSR), the out-of-vocabulary (OOV) rate becomes very large. If a morpheme or a syllable is used instead, a severe inter-morphe...
Publication date: 2008